REMARKS

Reconsideration and allowance in view of the foregoing amendment and the following 
remarks are respectfully requested. Claims 1-2, 13, 15 and 18 are amended without prejudice or 
disclaimer. 

Rejection of Claims 1-8 and 10-28 Under 35 U.S.C. §103(a) 

The Office Action rejects claims 1-8 and 10-28 under 35 U.S.C. § 103(a) as being 
unpatentable over Jain et al. (U.S. Patent No. 6,144,375) ("Jain et al") in view of Chen et al. 
(U.S. Patent No. 6,307,550) and further in view of Slezak (U.S. Patent No. 6,006,257) 
("Slezak"). Assignee traverses this rejection and shall explain why the combination of art fails to 
disclose or suggest each claim limitation. Furthermore, the complexity of the analysis, which 
takes almost 8 pages to complete, renders unpersuasive the assertion that it would be obvious to 
combine these three references. 

We first turn to claim 1 and the step of extracting image data from the plurality of still 
images from a subscriber. The Office Action addresses this limitation on page 3. However, in 
the analysis, the Office Action simply reads out of the limitation the word "still". The reference 
to an object extractor, filter, and so forth states that it draws this data "from the plurality of 
images received from camera(s)/video sources - . . .". However, none of the cited portions of the
reference involves a plurality of still images. Indeed, in each case, it is clear that
the input to their system is video data. For example, cited column 6, lines 16-20, state 
"However, those skilled in the computer user interface art will realize that the present viewer can 
be adapted for use with any system that provides context-sensitive video, audio, and data
information." (emphasis added). Throughout the cited portions, a video system and video data
are described. See, e.g., column 5, line 59, column 6, line 28, column 6, line 37, and so forth. In
the main example provided in cited column 6 - column 7, Figure 2 is referenced, in which "an
American football game is captured by a plurality of cameras. . .and subsequently analyzed by a
scene analysis sub-system 22." Column 7, lines 36-38 state that "If a video program is not in 
real-time (i.e., a television program) then it is possible to store an entire program in a video 
database." Assignee's basic point is that this reference clearly involves receiving and processing 
video information and not extracting image data from a plurality of still images. Claim 1 is
further amended to recite "wherein the plurality of still images are independent of a video 
sequence." This therefore eliminates the potential to interpret a plurality of still images as video 
data. This provides an underlying and fundamental distinguishing feature of claim 1 from Jain et 
al. 

The other portions of Jain et al. cited on page 3 also fail to match "extracting image data 
from the plurality of still images" because they deal with video. For example, columns 10-13 
discuss an environmental model (EM) process in which sensors or cameras are placed in various 
locations to videotape objects. Column 13, lines 34-36, state "As described below in more 
detail, users can supplement the audio/video information provided by the sensors 202 with 
additional information (e.g., statistical data, text, etc.)." Thus, columns 10 - 13 do not disclose 
or suggest extracting image data from a plurality of still images as defined in claim 1.

Finally, columns 18 and 19 also fail to disclose or suggest this feature inasmuch as this 
cited portion again references the example of the American football program in which the 
capture or filter process accepts "as input all of the video data streams provided by each video 
camera positioned approximate [to] a football field." Assignee has not found in the cited 
portions of this reference where a plurality of still images which are independent of a video 
sequence are received and image data is extracted therefrom. 

Furthermore, where a "still" image is referenced by Jain et al, it involves utilizing a still 
frame from video. Specifically, column 23, lines 21-25 explain that a control button can be used 
which allows the user to "freeze a video frame and in essence 'take a snapshot' of the video
image displayed in the video window 402. The still frame can be stored in the user's computer 
in a well-known format such as JPEG or TIFF formats." That clearly differs from what is recited
in claim 1.

Similarly, column 30, lines 45-46, explain that "For example, a still video image or 
snapshot of the object (in this case, a selected football player) 501 is displayed that readily 
identifies the object to the user." Again, this clearly involves taking a snapshot of video which 
fundamentally differs from the concepts recited in claim 1.

Next, the step of deriving a virtual camera script is analyzed on pages 3 and 4 of the
Office Action. There are numerous multi-column citations on page 4 which Assignee has
reviewed. None of these, however, discloses or suggests "wherein the derived virtual camera script
comprises a set of image processing instructions that simulates camera movement over portions
of the plurality of still images." The reason for this is that there is simply no "camera
movement" that is disclosed. The Office Action on page 4 highlights "such as best view camera
or selected camera" as being the equivalent of camera movement. This is not the case because in
each cited portion of the reference (such as columns 7, 15, 16, 18-20, 22-24 and 26-31), there
are no cameras that move. Rather, a camera with the best view is simply selected. An
example easily illustrates the point. Cited column 27, lines 14-62, discuss "alternate best views".
In this case, the user can indicate a player of the football game and the system can flash

"the camera button capable of providing the next best view. For example, when the user 
manual operates the viewer, the user manually selects a viewing perspective of a player 
by pressing a camera button. As the player moves, and as the play develops, the camera 
selected may no longer provide the best perspective from which to view the player. 
Therefore, depending upon the user query, the viewer automatically hints to the user that 
another camera may have a better perspective then the one presently being used in a 
'tracking' mode, the system will automatically switch control to the camera having the 
'best' viewing perspective. Depending upon the application, the viewer uses several 
criteria in determining the camera angle having the best view of the selected object, 
player, or event." 
Thus, the concept of a "best viewed camera" is simply the concept of choosing a camera
which has a good view of the object. One of skill in the art would understand that this means
picking a camera. There simply is no disclosure of a simulation of camera movement. Each
camera is disclosed in this case as being a stationary, actual camera. The concept
disclosed in Jain et al. therefore clearly differs from any type of simulation of camera movement,
especially movement over portions of a plurality of still images.

Other cited portions of the reference also support this interpretation. For example, cited
column 7, lines 25-34, also reference the "best camera." Here, they explain:

"These images may all come from one perspective, or the MPI video system may have to 
select the best camera at every point in time in order to display the selected view and 
perspective. Accordingly, multiple cameras may be used to display a sequence over time, 
but at any given time only a single best camera is used. This requires the capability of 
solving a 'camera hand-off problem." 

Again, there is no camera movement that is contemplated. 

Additionally, the concept of a "selected camera" as discussed on page 4 of the Office 
Action means just that: the system of Jain et al. involves selecting a camera that has a good view
of the object. The claimed feature involving camera movement is not disclosed or suggested.
Additionally, incorporating the arguments set forth above, Jain et al. clearly involves selecting 
cameras in order to follow a moving object. As has been noted above, these cameras are 
disclosed as video cameras and therefore, the additional concept in claim 1 of the derived 
"virtual camera script comprising a set of image processing instructions that simulates camera 
movement over portions of the plurality of still images" is clearly neither disclosed by nor the equivalent
of the concept of choosing a best view video camera to view a football game. These are very
different concepts. 

Next, page 4 of the Office Action treats the step of generating a video sequence based on the
extracted image data, the derived virtual camera script and the coding hint as being disclosed in
Jain et al. in the various cited portions on page 5. However, as has been noted above, the
viewer's input or selection of a best view camera that is disclosed in Jain et al. is clearly not 
drawn from "the extracted image data" which is drawn from a plurality of still images. Again, 
claim 1 is amended to recite wherein the plurality of still images are independent of a video 
sequence. Clearly, Jain et al. does not disclose such a feature and therefore the step of generating 
a video sequence as it is defined in claim 1 is not disclosed or suggested in the reference. 

The coding step on page 5 is also asserted to be disclosed in Jain et al. but for the same 
reasons set forth above, this limitation is not disclosed or suggested. The coding step requires 
coding the generated video sequence, wherein the coding hint references a coding process and a 
temporal evolution for each still image of the plurality of still images. This again takes it out of 
the scope of Jain et al. and even a broad reading of this claim limitation does not render it the 
equivalent of the videos disclosed in Jain et al. Accordingly, this limitation is not disclosed or 
suggested. Inasmuch as Jain et al. differs in numerous respects from the limitations of claim 1 as 
set forth above, Assignee submits that this claim is patentable and in condition for allowance 
based on this reasoning. 

Additionally, Assignee does not acquiesce that it would be obvious to combine Jain et al, 
Chen et al. and Slezak for the reasons set forth in the Office Action. Assignee reserves the right 
to argue against such combination. However, one argument is set forth next and it is noted that 
others could easily be developed. 

Specifically, pages 5 and 6 of the Office Action concede that "Jain does not explicitly 
disclose multimedia input from the subscriber, inserting a customized advertisement during the 
multimedia presentation, wherein the inserted customized advertisement includes an offer of an 
award to a user contingent, at least partly, on a user interaction with the customized 
advertisement." Assignee notes that this characterization of what Jain et al. fail to teach
fundamentally differs from the limitations of the claims. For example, claim 1 recites
receiving a multimedia data input from a subscriber including a plurality of still images that 
includes viewer-specific image data. In order to more clearly distinguish from Jain et al, this 
limitation has been amended to recite "receiving a plurality of still images, from a subscriber,
that includes viewer-specific image data, wherein the plurality of still images are independent of
a video sequence." It appears that pages 5-6 of the Office Action interpret Jain et al. as not
explicitly disclosing "multimedia input" which would include video input. By focusing this feature
of claim 1 on receiving a plurality of still images, the claim is even easier to distinguish from 
Jain et al. 

However, page 6 cites Chen et al. as disclosing a plurality of still images and concludes 
that it would be obvious to one of ordinary skill in the art at the time of the invention to modify
Jain et al. to use the teaching as taught by Chen et al. in order to yield predictable results such as 
to provide multimedia input from subscriber to an output device thereby improving efficiency 
and multimedia generating. This analysis is so broad as to fail to comply with MPEP 2142 
which requires that the analysis regarding obviousness under Section 103 should be made 
explicit. There must be some articulated reasoning with some rational underpinning to support 
the legal conclusion of obviousness. In this case, one of skill in the art would not likely utilize 
Chen et al.'s teachings because cited column 2, lines 37-38, of Chen et al. involves still images 
generated from a "user supplied video [which] may be formatted into an electronic album of 
photographic images referred to herein as 'video album.'" Therefore, when page 6 of the Office 
Action states "Chen discloses multimedia input from a subscriber including a plurality of still 
images," this incorrectly interprets the disclosure of Chen et al. Chen et al. clearly discloses that 
the user supplies a video and that the system generates still images therefrom. This concept of
course is already found in Jain et al. where, in several places discussed above, Jain et al. disclose a
"still frame" which is generated as a snapshot from the video image.

Therefore, even citing Chen et al. fails to match the limitation of receiving a plurality of 
still images from a subscriber. One of skill in the art would not likely incorporate the teachings 
of Chen et al. for this purpose inasmuch as they are duplicative of disclosure already existing in 
Jain et al. Slezak should also not be combined with Jain et al. in view of Chen et al. because its 
disclosure focuses on advertising and simply watching TV. In this case, simply watching a 
movie or TV which is already edited and prepared would not likely be a concept that would be 
combined with Jain et al. which involves a context such as a football game in which a "real 
world environment" is viewed from many viewing angles or input devices. Even a review of the 
Abstract of Jain et al. reveals that it is fundamentally different from the teachings of Slezak 
which is clearly focused on on-demand movies, games and television program watching. 
Accordingly, the preponderance of the evidence is against one of skill in the art likely combining 
the references in the manner proposed. Furthermore, as has been noted above, the particular 
analysis with respect to whether one of skill in the art would combine these references 
mischaracterizes the cited art as well as the particular claim limitations, which further weakens 
the persuasiveness of the analysis. 

Accordingly, for these several reasons, claim 1 is patentable and in condition for 
allowance as well as its dependent claims 2-8 and 10-28. 
CONCLUSION

Having addressed all rejections and objections, the subject application is in condition for
allowance and a Notice to that effect is earnestly solicited. If necessary, the Commissioner for
Patents is authorized to charge or credit the Novak, Druce & Quigg, LLP, Account No. 14-143